On Short-Time Estimation of Vocal Tract Length from Formant Frequencies
Authors
Abstract
Vocal tract length is highly variable across speakers and determines many aspects of the acoustic speech signal, making it an essential parameter to consider for explaining behavioral variability. A method for accurate estimation of vocal tract length from formant frequencies would allow normalization of interspeaker variability and facilitate acoustic comparisons across speakers. A framework for considering estimation methods is developed from the basic principles of vocal tract acoustics, and an estimation method is proposed that follows naturally from this framework. The proposed method is evaluated using acoustic characteristics of simulated vocal tracts ranging from 14 to 19 cm in length, as well as real-time magnetic resonance imaging data with synchronous audio from five speakers whose vocal tracts range from 14.5 to 18.0 cm in length. Evaluations show improvements in accuracy over previously proposed methods, with root mean square errors of 0.631 cm on simulated data and 1.277 cm on human speech data. Empirical results show that the effectiveness of the proposed method stems from emphasizing higher formant frequencies, which seem less affected by speech articulation. Theoretical predictions of formant sensitivity reinforce this empirical finding, and theoretical insights into the reasons for differences in formant sensitivity are also presented.
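As a rough illustration of the idea described in the abstract, the sketch below estimates vocal tract length from formants using the textbook uniform-tube model (closed at the glottis, open at the lips), in which the n-th resonance is F_n = (2n - 1)c / (4L). This is not the authors' estimator: the function names, the assumed speed of sound of 35,000 cm/s, and the example weights that favor higher formants are all hypothetical choices made for illustration.

```python
import math

# Assumed speed of sound in the vocal tract, in cm/s (illustrative value).
SPEED_OF_SOUND_CM_PER_S = 35000.0


def length_from_formant(formant_hz, n, c=SPEED_OF_SOUND_CM_PER_S):
    """Length (cm) of a uniform closed-open tube whose n-th resonance is formant_hz."""
    return (2 * n - 1) * c / (4.0 * formant_hz)


def estimate_length(formants_hz, weights=None, c=SPEED_OF_SOUND_CM_PER_S):
    """Weighted average of the per-formant length estimates.

    formants_hz lists F1, F2, ... in order. weights is a hypothetical
    per-formant weighting; larger weights on later entries emphasize
    higher formants. Equal weights are used when none are given.
    """
    per_formant = [
        length_from_formant(f, n, c) for n, f in enumerate(formants_hz, start=1)
    ]
    if weights is None:
        weights = [1.0] * len(per_formant)
    if len(weights) != len(per_formant):
        raise ValueError("weights and formants_hz must have the same length")
    total = sum(weights)
    return sum(w * length for w, length in zip(weights, per_formant)) / total


def rmse(estimates, truths):
    """Root mean square error between estimated and reference lengths."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, truths)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Formants of an ideal 17.5 cm uniform tube under the assumed speed of sound.
    neutral = [500.0, 1500.0, 2500.0, 3500.0]
    print(estimate_length(neutral))
    # Hypothetical weighting that ignores F1 and F2 and relies on F3 and F4.
    print(estimate_length(neutral, weights=[0.0, 0.0, 1.0, 1.0]))
```

For an ideal uniform tube every formant yields the same length, so the weighting has no effect; with real speech, articulation perturbs the formants unequally, which is where a weighting toward higher formants would matter.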
Similar resources
On instantaneous vocal tract length estimation from formant frequencies
The length of the vocal tract and its relationship with formant frequencies is examined at fine temporal scales with the goal of providing accurate estimates of vocal tract length from acoustics on a spectrum-by-spectrum basis despite unknown articulatory information. Accurate vocal tract length estimation is motivated by applications to speaker normalization and biometrics. Analyses presented ...
Determination of human vocal-tract dynamic geometry from formant trajectories using spatial and temporal Fourier analysis
This article presents a method for estimating the vocal-tract cross-sectional area, considered as a function of time and position along the tract length. The estimation is based on the speech formant frequencies, and uses a priori information about natural tract configurations. In outline, the method is as follows: First, the cross-sectional area is represented by a two-dimensional Fouri...
Correlation between vocal tract length, body height, formant frequencies, and pitch frequency for the five Japanese vowels uttered by fifteen male speakers
We conducted quantitative analyses of a magnetic resonance imaging (MRI) database to examine the correlation between physical measures (vocal tract length and body height) and acoustic parameters (pitch and formant frequencies) of vowels. The vocal tract length was measured from MRI data for the five Japanese vowels produced by fifteen male Japanese speakers between the ages of 24 and 55. The a...
An experiment in vocal tract length estimation
December 13-15, 2007: Firenze, Italy, ed. by C. Manfredi, ISBN 978-88-8453-673-3 (print) ISBN 978-88-8453-674-7 (online) © Firenze University Press, 2007. Abstract: The presentation concerns the estimation of the vocal tract length of a speaker on the basis of her formant frequencies together with the formant frequencies and known tract length of a reference speaker. The length prediction is founded on a ...
Formant estimation in children's speech and its application for a Spanish speech therapy tool
This paper addresses the problem of how to estimate reliable formant frequencies in high-pitched speech (typical in children), and how to normalize these estimates so that they are independent of vocal tract shape or length. The normalized formant frequencies are used to improve the performance of a Computer-Aided Speech Therapy Tool (CASTT) in Spanish. For this purpose, a study was conducted to see what is...